18 research outputs found

    Reconstruction de phase et de signaux audio avec des fonctions de coût non-quadratiques

    Audio signal reconstruction consists of recovering sound signals from incomplete or degraded representations. This problem can be cast as an inverse problem, and such problems are frequently tackled with optimization or machine learning strategies. In this thesis, we propose to change the cost function in inverse problems related to audio signal reconstruction. We mainly address the phase retrieval problem, which is common when manipulating audio spectrograms. A first line of work tackles the optimization of non-quadratic cost functions for phase retrieval. We study this problem in two contexts: audio signal reconstruction from a single spectrogram, and source separation. We introduce a novel formulation of the problem with Bregman divergences, as well as algorithms for solving it. A second line of work proposes to learn the cost function from a given dataset. This is done within the framework of unfolded neural networks, which are derived from iterative algorithms. We introduce a neural network based on the unfolding of the Alternating Direction Method of Multipliers (ADMM) that includes learnable activation functions. We make explicit the relation between learning its parameters and learning the cost function for phase retrieval. We conduct numerical experiments for each of the proposed methods to evaluate their performance and their potential for audio signal reconstruction.

    Learning the Proximity Operator in Unfolded ADMM for Phase Retrieval

    This paper considers the phase retrieval (PR) problem, which aims to reconstruct a signal from phaseless measurements such as magnitude or power spectrograms. PR is generally handled as a minimization problem involving a quadratic loss. Recent works have considered alternative discrepancy measures, such as the Bregman divergences, but it is still challenging to tailor the optimal loss for a given setting. In this paper, we propose a novel strategy to automatically learn the optimal metric for PR. We unfold a recently introduced ADMM algorithm into a neural network, and we emphasize that the information about the loss used to formulate the PR problem is conveyed by the proximity operator involved in the ADMM updates. Therefore, we replace this proximity operator with trainable activation functions: learning these in a supervised setting is then equivalent to learning an optimal metric for PR. Experiments conducted with speech signals show that our approach outperforms the baseline ADMM, using a light and interpretable neural architecture. Comment: 10 pages, 5 figures, submitted to IEEE SP
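As a rough illustration of why the proximity operator carries the information about the loss, the sketch below computes the proximity operator of a scalar quadratic loss in closed form and checks it against a grid search. This is only a toy under stated assumptions, not the paper's network: the names `prox_quadratic`, `objective` and `toy_activation`, the penalty weight `rho`, and the one-parameter "activation" are all hypothetical and chosen for illustration.

```python
def prox_quadratic(y, r, rho):
    """Closed-form minimizer over z of 0.5*(z - r)**2 + 0.5*rho*(z - y)**2.

    r plays the role of an observed magnitude, y of the current ADMM variable,
    and rho of the penalty weight (all hypothetical scalar stand-ins).
    """
    return (r + rho * y) / (1.0 + rho)


def objective(z, y, r, rho):
    """Quadratic data-fit term plus the quadratic coupling term to y."""
    return 0.5 * (z - r) ** 2 + 0.5 * rho * (z - y) ** 2


def toy_activation(y, r, w):
    """Hypothetical one-parameter family standing in for a trainable activation."""
    return w * r + (1.0 - w) * y


y, r, rho = 2.0, 1.0, 3.0
z_closed_form = prox_quadratic(y, r, rho)

# Brute-force check of the closed form on a fine grid over [0, 3].
grid = [i / 1000.0 for i in range(3001)]
z_grid = min(grid, key=lambda z: objective(z, y, r, rho))

# With w = 1 / (1 + rho), the toy activation coincides with the quadratic prox.
z_activation = toy_activation(y, r, 1.0 / (1.0 + rho))

print(z_closed_form, z_grid, z_activation)
```

In this toy, choosing a different `w` amounts to using a different effective weighting of the data-fit term, which loosely mirrors the idea that training the activation is a way of selecting the loss.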

    Sub-terahertz, microwaves and high energy emissions during the December 6, 2006 flare, at 18:40 UT

    The presence of a solar burst spectral component with flux density increasing with frequency in the sub-terahertz range, spectrally separated from the well-known microwave spectral component, brings new possibilities to explore the physical processes of flares, both observationally and theoretically. The solar event of 6 December 2006, starting at about 18:30 UT, exhibited a particularly well-defined double spectral structure, with the sub-THz spectral component detected at 212 and 405 GHz by SST and microwaves (1-18 GHz) observed by the Owens Valley Solar Array (OVSA). Emissions recorded by satellite instruments are discussed, with emphasis on ultraviolet (UV) from the Transition Region And Coronal Explorer (TRACE), soft X-rays from the Geostationary Operational Environmental Satellites (GOES), and X- and gamma-rays from the Ramaty High Energy Solar Spectroscopic Imager (RHESSI). The sub-THz impulsive component had its closest temporal counterpart only in the higher-energy X- and gamma-ray ranges. The spatial positions of the centers of emission at 212 GHz for the first flux enhancement were clearly displaced by more than one arc-minute from the positions at the following phases. The observed sub-THz fluxes and burst source plasma parameters were found difficult to reconcile with a purely thermal emission component. We discuss possible mechanisms to explain the double spectral components at microwaves and in the THz ranges. Comment: Accepted version for publication in Solar Physics

    The Changing Landscape for Stroke Prevention in AF: Findings From the GLORIA-AF Registry Phase 2

    Background: GLORIA-AF (Global Registry on Long-Term Oral Antithrombotic Treatment in Patients with Atrial Fibrillation) is a prospective, global registry program describing antithrombotic treatment patterns in patients with newly diagnosed nonvalvular atrial fibrillation at risk of stroke. Phase 2 began when dabigatran, the first non–vitamin K antagonist oral anticoagulant (NOAC), became available. Objectives: This study sought to describe phase 2 baseline data and compare these with the pre-NOAC era data collected during phase 1. Methods: During phase 2, 15,641 consenting patients were enrolled (November 2011 to December 2014); 15,092 were eligible. This pre-specified cross-sectional analysis describes eligible patients' baseline characteristics. Atrial fibrillation disease characteristics, medical outcomes, and concomitant diseases and medications were collected. Data were analyzed using descriptive statistics. Results: Of the total patients, 45.5% were female; median age was 71 (interquartile range: 64, 78) years. Patients were from Europe (47.1%), North America (22.5%), Asia (20.3%), Latin America (6.0%), and the Middle East/Africa (4.0%). Most had high stroke risk (CHA2DS2-VASc [Congestive heart failure, Hypertension, Age ≥75 years, Diabetes mellitus, previous Stroke, Vascular disease, Age 65 to 74 years, Sex category] score ≥2; 86.1%); 13.9% had moderate risk (CHA2DS2-VASc = 1). Overall, 79.9% received oral anticoagulants (47.6% NOAC and 32.3% vitamin K antagonists [VKA]); 12.1% received antiplatelet agents; 7.8% received no antithrombotic treatment. For comparison, the proportion of phase 1 patients (of N = 1,063 all eligible) prescribed VKA was 32.8%, acetylsalicylic acid 41.7%, and no therapy 20.2%. In Europe in phase 2, treatment with NOAC was more common than VKA (52.3% and 37.8%, respectively); 6.0% of patients received antiplatelet treatment; and 3.8% received no antithrombotic treatment. In North America, 52.1%, 26.2%, and 14.0% of patients received NOAC, VKA, and antiplatelet drugs, respectively; 7.5% received no antithrombotic treatment. NOAC use was less common in Asia (27.7%), where 27.5% of patients received VKA, 25.0% antiplatelet drugs, and 19.8% no antithrombotic treatment. Conclusions: The baseline data from GLORIA-AF phase 2 demonstrate that in newly diagnosed nonvalvular atrial fibrillation patients, NOAC have been highly adopted into practice, becoming more frequently prescribed than VKA in Europe and North America. Worldwide, however, a large proportion of patients remain undertreated, particularly in Asia and North America. (Global Registry on Long-Term Oral Antithrombotic Treatment in Patients With Atrial Fibrillation [GLORIA-AF]; NCT01468701)

    Phase retrieval and audio signal reconstruction with non-quadratic cost functions

    Audio signal reconstruction consists of recovering sound signals from incomplete or degraded representations. This problem can be cast as an inverse problem, and such problems are frequently tackled with optimization or machine learning strategies. In this thesis, we propose to change the cost function in inverse problems related to audio signal reconstruction. We mainly address the phase retrieval problem, which is common when manipulating audio spectrograms. A first line of work tackles the optimization of non-quadratic cost functions for phase retrieval. We study this problem in two contexts: audio signal reconstruction from a single spectrogram, and source separation. We introduce a novel formulation of the problem with Bregman divergences, as well as algorithms for solving it. A second line of work proposes to learn the cost function from a given dataset. This is done within the framework of unfolded neural networks, which are derived from iterative algorithms. We introduce a neural network based on the unfolding of the Alternating Direction Method of Multipliers (ADMM) that includes learnable activation functions. We make explicit the relation between learning its parameters and learning the cost function for phase retrieval. We conduct numerical experiments for each of the proposed methods to evaluate their performance and their potential for audio signal reconstruction.

    Phase retrieval with Bregman divergences and application to audio signal recovery

    Phase retrieval (PR) aims to recover a signal from the magnitudes of a set of inner products. This problem arises in many audio signal processing applications that operate on a short-time Fourier transform magnitude or power spectrogram and discard the phase information. Recovering the missing phase from the resulting modified spectrogram is necessary in order to synthesize time-domain signals. PR is commonly addressed by considering a minimization problem involving a quadratic loss function. In this paper, we adopt a different standpoint: the quadratic loss does not properly account for some perceptual properties of audio, and alternative discrepancy measures such as beta-divergences have been preferred in many settings. Therefore, we formulate PR as a new minimization problem involving Bregman divergences. We consider a general formulation that actually addresses two problems, since it accounts for the fact that these divergences are not symmetric in general. To optimize the resulting objective, we derive two algorithms, based on accelerated gradient descent and on the alternating direction method of multipliers (ADMM). Experiments conducted on audio signal recovery from either exact or modified spectrograms highlight the potential of our proposed methods for audio restoration. In particular, leveraging some of these Bregman divergences induces better performance than the quadratic loss when performing PR from highly degraded spectrograms. Comment: 23 pages, 4 figures, submitted to the IEEE Journal of Selected Topics in Signal Processing.
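To make the notion of a Bregman divergence concrete, here is a minimal sketch using the standard definition D_psi(p | q) = psi(p) - psi(q) - psi'(q)(p - q), summed over entries. The three generating functions are textbook choices (quadratic, Kullback-Leibler, Itakura-Saito) and are not claimed to be the set used in the paper; the names `bregman`, `GENERATORS`, `divergence` and the sample values are hypothetical.

```python
import math


def bregman(psi, dpsi, p, q):
    """Sum of entrywise Bregman divergences D_psi(p | q) for positive sequences."""
    return sum(psi(a) - psi(b) - dpsi(b) * (a - b) for a, b in zip(p, q))


# Generating function psi and its derivative for three classical divergences.
GENERATORS = {
    "quadratic": (lambda x: 0.5 * x * x, lambda x: x),
    "kullback_leibler": (lambda x: x * math.log(x) - x, lambda x: math.log(x)),
    "itakura_saito": (lambda x: -math.log(x), lambda x: -1.0 / x),
}


def divergence(name, p, q):
    """Look up a generator by name and evaluate the divergence of p from q."""
    psi, dpsi = GENERATORS[name]
    return bregman(psi, dpsi, p, q)


observed = [1.0, 2.0, 0.5]  # hypothetical observed magnitudes
estimate = [1.5, 1.0, 0.5]  # hypothetical estimated magnitudes

for name in GENERATORS:
    # Swapping the arguments generally changes the value, except for the quadratic case.
    print(name, divergence(name, observed, estimate), divergence(name, estimate, observed))
```

The quadratic generator recovers half the squared Euclidean distance, which is symmetric; the other two give different values when the arguments are swapped, which is the non-symmetry that leads to two distinct problem formulations.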

    Phase recovery with Bregman divergences for audio source separation

    Time-frequency audio source separation is usually achieved by estimating the short-time Fourier transform (STFT) magnitude of each source, and then applying a phase recovery algorithm to retrieve time-domain signals. In particular, the multiple input spectrogram inversion (MISI) algorithm has shown good performance in several recent works. This algorithm minimizes a quadratic reconstruction error between magnitude spectrograms. However, this loss does not properly account for some perceptual properties of audio, and alternative discrepancy measures such as beta-divergences have been preferred in many settings. In this paper, we propose to reformulate phase recovery in audio source separation as a minimization problem involving Bregman divergences. To optimize the resulting objective, we derive a projected gradient descent algorithm. Experiments conducted on a speech enhancement task show that this approach outperforms MISI for several alternative losses, which highlights their relevance for audio source separation applications.
